A Direct Estimation Approach to Sparse Linear Discriminant Analysis

نویسندگان

  • Tony Cai
  • Weidong Liu
چکیده

This paper considers sparse linear discriminant analysis of high-dimensional data. In contrast to the existing methods which are based on separate estimation of the precision matrix Ω and the difference δ of the mean vectors, we introduce a simple and effective classifier by estimating the product Ωδ directly through constrained l1 minimization. The estimator can be implemented efficiently using linear programming and the resulting classifier is called the linear programming discriminant (LPD) rule. The LPD rule is shown to have desirable theoretical and numerical properties. It exploits the approximate sparsity of Ωδ and as a consequence allows cases where it can still perform well even when Ω and/or δ cannot be estimated consistently. Asymptotic properties of the LPD rule are investigated and consistency and rate of convergence results are given. The LPD classifier has superior finite sample performance and significant computational advantages over the existing methods that require separate estimation of Ω and δ. The LPD rule is also applied to Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, PA 19104, [email protected]. The research was supported in part by NSF FRG Grant DMS-0854973. Department of Mathematics and Institute of Natural Sciences, Shanghai Jiao Tong University, China.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Parametric Simplex Approach to Statistical Learning Problems

In this paper, we show that the parametric simplex method is an efficient algorithm for solving various statistical learning problems that can be written as linear programs parametrized by a so-called regularization parameter. The parametric simplex method offers significant advantages over other methods: (1) it finds the complete solution path for all values of the regularization parameter by ...

متن کامل

A Note On the Connection and Equivalence of Three Sparse Linear Discriminant Analysis Methods

In this paper we reveal the connection and equivalence of three sparse linear discriminant analysis methods: the `1-Fisher’s discriminant analysis proposed in Wu et al. (2008), the sparse optimal scoring proposed in Clemmensen et al. (2011) and the direct sparse discriminant analysis proposed in Mai et al. (2012). It is shown that, for any sequence of penalization parameters, the normalized sol...

متن کامل

Communication-efficient Distributed Sparse Linear Discriminant Analysis

We propose a communication-e cient distributed estimation method for sparse linear discriminant analysis (LDA) in the high dimensional regime. Our method distributes the data of size N into m machines, and estimates a local sparse LDA estimator on each machine using the data subset of size N/m. After the distributed estimation, our method aggregates the debiased local estimators from m machines...

متن کامل

Robust Estimation in Linear Regression with Molticollinearity and Sparse Models

‎One of the factors affecting the statistical analysis of the data is the presence of outliers‎. ‎The methods which are not affected by the outliers are called robust methods‎. ‎Robust regression methods are robust estimation methods of regression model parameters in the presence of outliers‎. ‎Besides outliers‎, ‎the linear dependency of regressor variables‎, ‎which is called multicollinearity...

متن کامل

A direct approach to sparse discriminant analysis in ultra-high dimensions

Sparse discriminant methods based on independence rules, such as the nearest shrunken centroids classifier (Tibshirani et al., 2002) and features annealed independence rules (Fan & Fan, 2008), have been proposed as computationally attractive tools for feature selection and classification with high-dimensional data. A fundamental drawback of these rules is that they ignore correlations among fea...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011